Robust acoustic domain identification with its application to speaker diarization
Abstract
With the rise in multimedia content over the years, more variety is observed in the recording environments of audio. An audio processing system might benefit when it has a module to identify the acoustic domain at its front-end. In this paper, we demonstrate the idea of acoustic domain identification (ADI) for speaker diarization. For this, we first present a detailed study of the various domains of the third DIHARD challenge, highlighting the factors that differentiated them from each other. Our main contribution is to develop a simple and efficient solution for ADI. In the present work, we explore speaker embeddings for this task. Next, we integrate the ADI module with the speaker diarization framework of the DIHARD III challenge. The performance substantially improved over that of the baseline when the thresholds for agglomerative hierarchical clustering were optimized according to the respective domains. We achieved a relative improvement of more than 5% and 8% in DER for the core and full conditions, respectively, on Track 1 of the DIHARD III evaluation set.
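The idea of choosing the clustering threshold according to the identified domain can be sketched as below. This is a minimal illustration, not the authors' implementation: the threshold values, the `DEFAULT_THRESHOLD` fallback, the use of cosine distance with average linkage, and the function names `ahc` and `diarize` are all assumptions made for this example. The domain label is taken as given, standing in for the output of an ADI module.

```python
import math

# Hypothetical per-domain stopping thresholds (cosine distance).
# The values are illustrative only and are not taken from the paper.
DOMAIN_THRESHOLDS = {"audiobooks": 0.95, "restaurant": 0.5}
DEFAULT_THRESHOLD = 0.5  # assumed fallback for domains without a tuned value


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def ahc(embeddings, threshold):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the closest pair of clusters and stops once the
    smallest inter-cluster distance exceeds `threshold`.
    Returns one cluster label per embedding.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(c1, c2):
        total = sum(cosine_distance(embeddings[i], embeddings[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(embeddings)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


def diarize(embeddings, domain):
    """Cluster segment embeddings using the threshold for `domain`."""
    threshold = DOMAIN_THRESHOLDS.get(domain, DEFAULT_THRESHOLD)
    return ahc(embeddings, threshold)
```

With the same segment embeddings, a permissive threshold merges everything into one speaker while a stricter one keeps the two groups apart, which is why a per-domain setting can change the diarization output.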
Related articles
Robust Speaker Diarization for meetings
This thesis presents research on speaker diarization for meeting rooms. It looks into the algorithms and the implementation of an offline speaker segmentation and clustering system for a meeting recording where usually more than one microphone is available. The main research and system implementation have been done while visiting the International Computer Science Institute...
Speaker Diarization in Meetings Domain
The purpose of this study is to develop robust techniques for speaker segmentation and clustering with a focus on the meetings domain. The techniques examined can, however, be applied to other domains such as telephone and broadcast news. Traditional techniques for speaker diarization developed for telephone conversations or broadcast news are based on a single channel, which is notably different f...
Speaker Diarization Using a priori Acoustic Information
Speaker diarization is usually performed in a blind manner without using a priori knowledge about the identity or acoustic characteristics of the participating speakers. In this paper we propose a novel framework for incorporating available a priori knowledge such as potential participating speakers, channels, background noise and gender, and integrating these knowledge sources into blind speak...
A sticky HDP-HMM with application to speaker diarization
We consider the problem of speaker diarization, the problem of segmenting an audio recording of a meeting into temporal segments corresponding to individual speakers. The problem is rendered particularly difficult by the fact that we are not allowed to assume knowledge of the number of people participating in the meeting. To address this problem, we take a Bayesian nonparametric approach to spe...
Robust Unsupervised Speaker Segmentation for Audio Diarization
Audio diarization (Reynolds & Carrasquillo, 2005) is the process of partitioning an input audio stream into homogeneous regions according to their specific audio sources. These sources can include audio type (speech, music, background noise, etc.), speaker identity and channel characteristics. With the continually increasing number of large volumes of spoken documents including broadcasts, voic...
Journal
Journal title: International Journal of Speech Technology
Year: 2022
ISSN: 1381-2416, 1572-8110
DOI: https://doi.org/10.1007/s10772-022-09990-9